SHAP and LIME


Evaluating the stability of model explanations in instance-dependent cost-sensitive credit scoring

Ballegeer, Matteo, Bogaert, Matthias, Benoit, Dries F.

arXiv.org Artificial Intelligence

Instance-dependent cost-sensitive (IDCS) classifiers offer a promising approach to improving cost-efficiency in credit scoring by tailoring loss functions to instance-specific costs. However, the impact of such loss functions on the stability of model explanations remains unexplored in the literature, despite increasing regulatory demands for transparency. This study addresses this gap by evaluating the stability of Local Interpretable Model-agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP) when applied to IDCS models. Using four publicly available credit scoring datasets, we first assess the discriminatory power and cost-efficiency of IDCS classifiers, introducing a novel metric to enhance cross-dataset comparability. We then investigate the stability of SHAP and LIME feature importance rankings under varying degrees of class imbalance through controlled resampling. Our results reveal that while IDCS classifiers improve cost-efficiency, they produce significantly less stable explanations compared to traditional models, particularly as class imbalance increases, highlighting a critical trade-off between cost optimization and interpretability in credit scoring. Amid increasing regulatory scrutiny on explainability, this research underscores the pressing need to address stability issues in IDCS classifiers to ensure that their cost advantages are not undermined by unstable or untrustworthy explanations.
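
A minimal sketch, not the authors' code, of the stability protocol the abstract describes: resample an imbalanced dataset several times, compute SHAP-based feature importances for each resample, and compare the resulting rankings. The synthetic dataset, the gradient-boosting model, and Kendall's tau as the stability metric are illustrative assumptions.

    import numpy as np
    from itertools import combinations
    from scipy.stats import kendalltau
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier
    import shap

    # Imbalanced synthetic data standing in for a credit scoring dataset.
    X, y = make_classification(n_samples=2000, n_features=10,
                               weights=[0.95, 0.05], random_state=0)

    importances = []
    rng = np.random.default_rng(0)
    for _ in range(5):
        idx = rng.choice(len(X), size=len(X), replace=True)       # bootstrap resample
        model = GradientBoostingClassifier(random_state=0).fit(X[idx], y[idx])
        sv = shap.TreeExplainer(model).shap_values(X[:200])       # (n, features) for this binary GBM
        importances.append(np.abs(sv).mean(axis=0))               # global importance = mean |SHAP|

    # Pairwise rank correlation between runs: values near 1 indicate stable explanations.
    taus = [kendalltau(a, b)[0] for a, b in combinations(importances, 2)]
    print("mean SHAP-ranking stability (Kendall's tau):", round(float(np.mean(taus)), 3))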


SHLIME: Foiling adversarial attacks fooling SHAP and LIME

Chauhan, Sam, Duguet, Estelle, Ramakrishnan, Karthik, Van Deventer, Hugh, Kruger, Jack, Subbaraman, Ranjan

arXiv.org Artificial Intelligence

Post hoc explanation methods, such as LIME and SHAP, provide interpretable insights into black-box classifiers and are increasingly used to assess model biases and generalizability. However, these methods are vulnerable to adversarial manipulation, potentially concealing harmful biases. Building on the work of Slack et al. (2020), we investigate the susceptibility of LIME and SHAP to biased models and evaluate strategies for improving robustness. We first replicate the original COMPAS experiment to validate prior findings and establish a baseline. We then introduce a modular testing framework enabling systematic evaluation of augmented and ensemble explanation approaches across classifiers of varying performance. Using this framework, we assess multiple LIME/SHAP ensemble configurations on out-of-distribution models, comparing their resistance to bias concealment against the original methods. Our results identify configurations that substantially improve bias detection, highlighting their potential for enhancing transparency in the deployment of high-stakes machine learning systems.
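
A minimal sketch, not the authors' framework: one simple form of "ensembling" LIME is to run the explainer several times with different random seeds and average the per-feature weights, which makes it harder for an adversarial model to present a clean explanation on every perturbation draw. The toy dataset and model below are assumptions.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from lime.lime_tabular import LimeTabularExplainer

    X, y = make_classification(n_samples=1000, n_features=6, random_state=0)
    feature_names = [f"f{i}" for i in range(X.shape[1])]
    model = RandomForestClassifier(random_state=0).fit(X, y)

    instance = X[0]
    agg = np.zeros(X.shape[1])
    n_runs = 5
    for seed in range(n_runs):
        explainer = LimeTabularExplainer(X, feature_names=feature_names,
                                         mode="classification", random_state=seed)
        exp = explainer.explain_instance(instance, model.predict_proba,
                                         num_features=X.shape[1])
        for idx, weight in exp.as_map()[1]:            # attributions for the positive class
            agg[idx] += weight / n_runs

    ranked = np.argsort(-np.abs(agg))
    print("ensemble-averaged LIME ranking:", [feature_names[i] for i in ranked])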


Enhancing Multilingual Sentiment Analysis with Explainability for Sinhala, English, and Code-Mixed Content

Rizvi, Azmarah, Thamindu, Navojith, Adhikari, A. M. N. H., Senevirathna, W. P. U., Kasthurirathna, Dharshana, Abeywardhana, Lakmini

arXiv.org Artificial Intelligence

Sentiment analysis is crucial for brand reputation management in the banking sector, where customer feedback spans English, Sinhala, Singlish, and code-mixed text. Existing models struggle with low-resource languages like Sinhala and lack interpretability for practical use. This research develops a hybrid aspect-based sentiment analysis framework that enhances multilingual capabilities with explainable outputs. Using cleaned banking customer reviews, we fine-tune XLM-RoBERTa for Sinhala and code-mixed text, integrate domain-specific lexicon correction, and employ BERT-base-uncased for English. The system classifies sentiment (positive, neutral, negative) with confidence scores, while SHAP and LIME improve interpretability by providing real-time sentiment explanations. Experimental results show that our approaches outperform traditional transformer-based classifiers, achieving 92.3 percent accuracy and an F1-score of 0.89 in English and 88.4 percent in Sinhala and code-mixed content. An explainability analysis reveals key sentiment drivers, improving trust and transparency. A user-friendly interface delivers aspect-wise sentiment insights, ensuring accessibility for businesses. This research contributes to robust, transparent sentiment analysis for financial applications by bridging gaps in multilingual, low-resource NLP and explainability.
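
A minimal sketch of token-level sentiment explanation with LIME. The paper fine-tunes XLM-RoBERTa and BERT, but the explanation step works the same way on any predict_proba function; the tiny TF-IDF pipeline and training sentences below are placeholders.

    from sklearn.pipeline import make_pipeline
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from lime.lime_text import LimeTextExplainer

    train_texts = ["great service and fast approval", "terrible support, very slow",
                   "helpful staff, smooth process", "hidden fees and rude replies"]
    train_labels = [1, 0, 1, 0]                        # 1 = positive, 0 = negative

    pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression())
    pipeline.fit(train_texts, train_labels)

    explainer = LimeTextExplainer(class_names=["negative", "positive"])
    exp = explainer.explain_instance("the staff were helpful but the fees were hidden",
                                     pipeline.predict_proba, num_features=5)
    print(exp.as_list())                               # (token, weight) sentiment drivers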


EXPLICATE: Enhancing Phishing Detection through Explainable AI and LLM-Powered Interpretability

Lim, Bryan, Huerta, Roman, Sotelo, Alejandro, Quintela, Anthonie, Kumar, Priyanka

arXiv.org Artificial Intelligence

Sophisticated phishing attacks have emerged as a major cybersecurity threat, becoming more common and difficult to prevent. Though machine learning techniques have shown promise in detecting phishing attacks, they function mainly as "black boxes" without revealing their decision-making rationale. This lack of transparency erodes user trust and diminishes the effectiveness of their threat response. We present EXPLICATE: a framework that enhances phishing detection through a three-component architecture: an ML-based classifier using domain-specific features, a dual-explanation layer combining LIME and SHAP for complementary feature-level insights, and an LLM enhancement using DeepSeek v3 to translate technical explanations into accessible natural language. Our experiments show that EXPLICATE attains 98.4% accuracy on all metrics, which is on par with existing deep learning techniques but with better explainability. The framework generates high-quality explanations with 94.2% accuracy and 96.8% consistency between the LLM output and the model prediction. We deliver EXPLICATE as a fully usable GUI application and a lightweight Chrome extension, showing its applicability in many deployment situations. The research shows that high detection performance can go hand-in-hand with meaningful explainability in security applications. Most importantly, it addresses the critical divide between automated AI and user trust in phishing detection systems.
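
An illustrative sketch, not the EXPLICATE implementation, of a dual-explanation layer: SHAP and LIME are computed for the same prediction and their top features are folded into a plain-language prompt an LLM could rephrase for end users. The feature names, model, and prompt wording are assumptions, and the actual LLM call (DeepSeek v3 in the paper) is omitted.

    import numpy as np
    import shap
    from lime.lime_tabular import LimeTabularExplainer
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier

    names = ["url_length", "num_subdomains", "has_ip_host", "tls_cert_age", "num_redirects"]
    X, y = make_classification(n_samples=1000, n_features=5, random_state=0)
    model = GradientBoostingClassifier(random_state=0).fit(X, y)
    x = X[:1]

    # SHAP side of the dual-explanation layer.
    sv = shap.TreeExplainer(model).shap_values(x)[0]           # (n_features,) for this binary GBM
    shap_top = [names[i] for i in np.argsort(-np.abs(sv))[:3]]

    # LIME side of the dual-explanation layer.
    lime_explainer = LimeTabularExplainer(X, feature_names=names, mode="classification")
    lime_top = [names[i] for i, _ in lime_explainer.explain_instance(
        x[0], model.predict_proba, num_features=3).as_map()[1]]

    # Hypothetical prompt an LLM would turn into a user-facing explanation.
    prompt = (f"This URL was classified as phishing. SHAP highlighted {shap_top}; "
              f"LIME highlighted {lime_top}. Explain the decision in plain language.")
    print(prompt)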


Extending XReason: Formal Explanations for Adversarial Detection

Jemaa, Amira, Rashid, Adnan, Tahar, Sofiene

arXiv.org Artificial Intelligence

Explainable Artificial Intelligence (XAI) plays an important role in improving the transparency and reliability of complex machine learning models, especially in critical domains such as cybersecurity. Despite the prevalence of heuristic interpretation methods such as SHAP and LIME, these techniques often lack formal guarantees and may produce inconsistent local explanations. To address this gap, a few tools have emerged that use formal methods to provide explanations with formal guarantees. Among these, XReason uses a SAT solver to generate formal instance-level explanations for XGBoost models. In this paper, we extend the XReason tool to support LightGBM models as well as class-level explanations. Additionally, we implement a mechanism to generate and detect adversarial examples in XReason. We evaluate the efficiency and accuracy of our approach on the CICIDS-2017 dataset, a widely used benchmark for detecting network attacks.
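
A hedged illustration, not XReason's SAT-based procedure: a naive way to probe for adversarial examples around an instance is to apply small single-feature perturbations to a trained LightGBM model and flag those that flip the predicted class. Formal methods replace this brute-force search with exhaustive guarantees. The dataset and perturbation size below are assumptions.

    import numpy as np
    from sklearn.datasets import make_classification
    from lightgbm import LGBMClassifier

    X, y = make_classification(n_samples=2000, n_features=8, random_state=0)
    model = LGBMClassifier(random_state=0).fit(X, y)

    x = X[0].copy()
    base_pred = model.predict(x.reshape(1, -1))[0]
    flips = []
    for j in range(len(x)):
        for delta in (-0.5, 0.5):                      # small per-feature nudges
            candidate = x.copy()
            candidate[j] += delta
            if model.predict(candidate.reshape(1, -1))[0] != base_pred:
                flips.append((j, delta))
    print("single-feature perturbations that flip the prediction:", flips)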


An Approach To Enhance IoT Security In 6G Networks Through Explainable AI

Kaur, Navneet, Gupta, Lav

arXiv.org Artificial Intelligence

Wireless communication has evolved significantly, with 6G offering groundbreaking capabilities, particularly for IoT. However, the integration of IoT into 6G presents new security challenges, expanding the attack surface due to vulnerabilities introduced by advanced technologies such as open RAN, terahertz (THz) communication, IRS, massive MIMO, and AI. Emerging threats like AI exploitation, virtualization risks, and evolving attacks, including data manipulation and signal interference, further complicate security efforts. As 6G standards are set to be finalized by 2030, work continues to align security measures with technological advances. However, substantial gaps remain in frameworks designed to secure integrated IoT and 6G systems. Our research addresses these challenges by utilizing tree-based machine learning algorithms to manage complex datasets and evaluate feature importance. We apply data balancing techniques to ensure fair attack representation and use SHAP and LIME to improve model transparency. By aligning feature importance with XAI methods and cross-validating for consistency, we boost model accuracy and enhance IoT security within the 6G ecosystem.
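
A minimal sketch of the pipeline the abstract outlines: balance an imbalanced dataset, train a tree-based model, and cross-check the model's built-in feature importances against SHAP values for consistency. The synthetic data below stands in for an IoT/6G traffic dataset.

    import numpy as np
    import shap
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.utils import resample

    X, y = make_classification(n_samples=3000, n_features=8,
                               weights=[0.9, 0.1], random_state=0)

    # Oversample the minority (attack) class so both classes are equally represented.
    X_min, y_min = X[y == 1], y[y == 1]
    X_up, y_up = resample(X_min, y_min, n_samples=int((y == 0).sum()), random_state=0)
    X_bal = np.vstack([X[y == 0], X_up])
    y_bal = np.concatenate([y[y == 0], y_up])

    model = GradientBoostingClassifier(random_state=0).fit(X_bal, y_bal)

    sv = shap.TreeExplainer(model).shap_values(X_bal[:500])    # (n, features) for this binary GBM
    shap_rank = np.argsort(-np.abs(sv).mean(axis=0))
    builtin_rank = np.argsort(-model.feature_importances_)
    print("SHAP ranking:    ", shap_rank)
    print("built-in ranking:", builtin_rank)                   # agreement suggests consistent importances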


Model agnostic local variable importance for locally dependent relationships

Bladen, Kelvyn K., Cutler, Adele, Cutler, D. Richard, Moon, Kevin R.

arXiv.org Machine Learning

Global variable importance measures are commonly used to interpret machine learning model results. Local variable importance techniques assess how variables contribute to individual observations rather than the entire dataset. Current methods typically fail to accurately reflect locally dependent relationships between variables and instead focus on marginal importance values. Additionally, they are not natively adapted for multi-class classification problems. We propose a new model-agnostic method for calculating local variable importance, CLIQUE, that captures locally dependent relationships, contains improvements over permutation-based methods, and can be directly applied to multi-class classification problems. Simulated and real-world examples show that CLIQUE emphasizes locally dependent information and properly reduces bias in regions where variables do not affect the response.
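
A hedged baseline sketch, not CLIQUE itself: the simple "marginal" local importance the paper critiques replaces one feature of a single observation with values drawn from its marginal distribution and measures how much the prediction moves, ignoring local dependencies between variables. The data and model are toy assumptions.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=1500, n_features=6, random_state=0)
    model = RandomForestClassifier(random_state=0).fit(X, y)

    x = X[0]
    rng = np.random.default_rng(0)
    base = model.predict_proba(x.reshape(1, -1))[0, 1]

    local_importance = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        draws = rng.choice(X[:, j], size=100)          # marginal values for feature j
        perturbed = np.tile(x, (100, 1))
        perturbed[:, j] = draws
        local_importance[j] = np.abs(model.predict_proba(perturbed)[:, 1] - base).mean()

    print("marginal local importance per feature:", np.round(local_importance, 3))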


An Adaptive End-to-End IoT Security Framework Using Explainable AI and LLMs

Baral, Sudipto, Saha, Sajal, Haque, Anwar

arXiv.org Artificial Intelligence

The exponential growth of the Internet of Things (IoT) has significantly increased the complexity and volume of cybersecurity threats, necessitating the development of advanced, scalable, and interpretable security frameworks. This paper presents an innovative, comprehensive framework for real-time IoT attack detection and response that leverages Machine Learning (ML), Explainable AI (XAI), and Large Language Models (LLM). By integrating XAI techniques such as SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) with a model-independent architecture, we ensure our framework's adaptability across various ML algorithms. Additionally, the incorporation of LLMs enhances the interpretability and accessibility of detection decisions, providing system administrators with actionable, human-understandable explanations of detected threats. Our end-to-end framework not only facilitates a seamless transition from model development to deployment but also represents a real-world application capability that is often lacking in existing research. In our experiments with the CIC-IoT-2023 dataset (Neto et al., 2023), the Gemini and OpenAI LLMs demonstrate distinct strengths in attack mitigation: Gemini offers precise, focused strategies, while OpenAI provides extensive, in-depth security measures. Incorporating SHAP and LIME algorithms within XAI provides comprehensive insights into attack detection, emphasizing opportunities for model improvement through detailed feature analysis, fine-tuning, and learning from misclassifications to enhance accuracy.
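
A minimal, model-agnostic sketch in the spirit of the framework: any detector exposing predict_proba can be explained with KernelSHAP, and the top contributing features can be folded into a natural-language summary for an LLM to elaborate on. The feature names and prompt wording are assumptions, and the actual Gemini/OpenAI call is omitted.

    import numpy as np
    import shap
    from sklearn.datasets import make_classification
    from sklearn.neural_network import MLPClassifier

    feature_names = ["flow_duration", "pkt_rate", "syn_count", "dst_port_entropy", "payload_bytes"]
    X, y = make_classification(n_samples=1500, n_features=5, random_state=0)
    model = MLPClassifier(max_iter=500, random_state=0).fit(X, y)   # any black-box model works

    def attack_score(data):
        return model.predict_proba(data)[:, 1]                     # single-output wrapper

    explainer = shap.KernelExplainer(attack_score, shap.sample(X, 50))
    sv = explainer.shap_values(X[:1], nsamples=200)[0]              # per-feature contributions

    top = np.argsort(-np.abs(sv))[:3]
    prompt = ("The IDS flagged this flow as an attack. The strongest indicators were: "
              + ", ".join(f"{feature_names[i]} (contribution {sv[i]:+.2f})" for i in top)
              + ". Explain this to a system administrator and suggest a mitigation.")
    print(prompt)   # this string would be passed to the LLM in the paper's pipeline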


More Questions than Answers? Lessons from Integrating Explainable AI into a Cyber-AI Tool

Suh, Ashley, Li, Harry, Kenney, Caitlin, Alperin, Kenneth, Gomez, Steven R.

arXiv.org Artificial Intelligence

We share observations and challenges from an ongoing effort to implement Explainable AI (XAI) in a domain-specific workflow for cybersecurity analysts. Specifically, we briefly describe a preliminary case study on the use of XAI for source code classification, where accurate assessment and timeliness are paramount. We find that the outputs of state-of-the-art saliency explanation techniques (e.g., SHAP or LIME) are lost in translation when interpreted by people with little AI expertise, despite these techniques being marketed for non-technical users. Moreover, we find that popular XAI techniques offer fewer insights for real-time human-AI workflows when they are post hoc and too localized in their explanations. Instead, we observe that cyber analysts need higher-level, easy-to-digest explanations that can offer as little disruption as possible to their workflows. We outline unaddressed gaps in practical and effective XAI, then touch on how emerging technologies like Large Language Models (LLMs) could mitigate these existing obstacles.


Provably Stable Feature Rankings with SHAP and LIME

Goldwasser, Jeremy, Hooker, Giles

arXiv.org Artificial Intelligence

Feature attributions are ubiquitous tools for understanding the predictions of machine learning models. However, popular methods for scoring input variables such as SHAP and LIME suffer from high instability due to random sampling. Leveraging ideas from multiple hypothesis testing, we devise attribution methods that correctly rank the most important features with high probability. Our algorithm RankSHAP guarantees that the $K$ highest Shapley values have the proper ordering with probability exceeding $1-\alpha$. Empirical results demonstrate its validity and impressive computational efficiency. We also build on previous work to yield similar results for LIME, ensuring the most important features are selected in the right order.
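
A hedged illustration of the sampling instability RankSHAP targets, not the RankSHAP algorithm itself: a small Monte-Carlo Shapley estimator is run with different random seeds, and the identity of the top-ranked features can change from run to run unless enough samples are drawn. The model and data are toy assumptions.

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.ensemble import RandomForestRegressor

    X, y = make_regression(n_samples=800, n_features=6, random_state=0)
    model = RandomForestRegressor(random_state=0).fit(X, y)
    x, background = X[0], X[1:201]

    def mc_shapley(x, n_perm, rng):
        # Monte-Carlo Shapley values: average marginal contribution over random orderings.
        phi = np.zeros(len(x))
        for _ in range(n_perm):
            order = rng.permutation(len(x))
            z = background[rng.integers(len(background))].copy()   # start from a random baseline
            prev = model.predict(z.reshape(1, -1))[0]
            for j in order:
                z[j] = x[j]                                        # reveal feature j
                cur = model.predict(z.reshape(1, -1))[0]
                phi[j] += (cur - prev) / n_perm
                prev = cur
        return phi

    for seed in range(3):
        phi = mc_shapley(x, n_perm=20, rng=np.random.default_rng(seed))
        print("seed", seed, "top-3 features:", np.argsort(-np.abs(phi))[:3])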